The design of a generic data synchronizer, or, an [object] that does [actions] with the aid of [helpers]

Posted by acheong87 on Programmers
Published on 2012-07-01T14:58:45Z Indexed on 2012/07/01 15:22 UTC

I'd like to create a generic data-source "synchronizer," where data-source "types" may include MySQL databases, Google Spreadsheets documents, and CSV files, among others. I've been trying to figure out how to structure this in terms of classes and interfaces, keeping in mind (what I've read about) composition vs. inheritance and is-a vs. has-a, but each route I go down seems to violate some principle.

For simplicity, assume that all data-sources have a header-row-plus-data-rows format. For example, assume that the first rows of Google Spreadsheets documents and CSV files will have column headers, a.k.a. "fields" (to parallel database fields).

Also, eventually, I would like to implement this in PHP, but avoiding language-specific discussion would probably be more productive.

Here's an overview of what I've tried.


Part 1/4: ISyncable

class CMySQL implements ISyncable
    GetFields()     // sql query, pdo statement, whatever
    AddFields()
    RemFields()
    ...
    _dbh
class CGoogleSpreadsheets implements ISyncable
    GetFields()     // zend gdata api
    AddFields()
    RemFields()
    ...
    _spreadsheetKey
    _worksheetId
class CCsvFile implements ISyncable
    GetFields()     // read from buffer
    AddFields()
    RemFields()
    ...
    _buffer
interface ISyncable
    GetFields()
    AddFields($field1, $field2, ...)
    RemFields($field1, $field2, ...)
    ...
    CanAddFields()  // maybe the spreadsheet is locked for write, or
    CanRemFields()  // maybe no permission to alter a database table
    ...
    AddRow()
    ModRow()
    RemRow()
    ...
    Open()
    Close()
    ...

First Question: Does it make sense to use an interface, as above?
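
For concreteness, here is a minimal sketch of what the interface buys you, written in Python only for brevity (the eventual target is PHP). The names `Syncable` and `MemorySource` are hypothetical, and only a subset of the methods listed above is shown. The in-memory class stands in for CMySQL or CCsvFile; this is an illustration under those assumptions, not a recommended implementation.

```python
from abc import ABC, abstractmethod


class Syncable(ABC):
    """Hypothetical counterpart of ISyncable (subset of its methods)."""

    @abstractmethod
    def get_fields(self): ...

    @abstractmethod
    def add_fields(self, *fields): ...

    @abstractmethod
    def can_add_fields(self): ...

    @abstractmethod
    def rows(self): ...

    @abstractmethod
    def add_row(self, row): ...


class MemorySource(Syncable):
    """In-memory stand-in for a real data source, for illustration only."""

    def __init__(self, fields, rows=(), writable=True):
        self._fields = list(fields)
        self._rows = [dict(r) for r in rows]
        self._writable = writable

    def get_fields(self):
        return list(self._fields)

    def can_add_fields(self):
        # A real source would check locks or permissions here.
        return self._writable

    def add_fields(self, *fields):
        if not self._writable:
            raise PermissionError("source is read-only")
        for f in fields:
            if f not in self._fields:
                self._fields.append(f)

    def rows(self):
        return [dict(r) for r in self._rows]

    def add_row(self, row):
        # Store the row in this source's own column layout,
        # filling missing fields with an empty string.
        self._rows.append({f: row.get(f, "") for f in self._fields})
```

Anything written against `Syncable` never needs to know which concrete source it is talking to, which is the property the question is asking about.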


Part 2/4: CSyncer

Next, the thing that does the syncing.

class CSyncer
    __construct(ISyncable $A, ISyncable $B)
    Push()          // sync A to B
    Pull()          // sync B to A
    Sync()          // Push() and Pull() differ only in direction; factor out the shared work here.
                    // Sync()'s job is to make sure that the fields on each side
                    // match, to add fields where appropriate and possible, to
                    // account for different column-orderings, etc., and of
                    // course, to add and remove rows as necessary to sync.
    ...
    _A
    _B

Second Question: Does it make sense to define such a class, or am I treading dangerously close to the "Kingdom of Nouns"?
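
One way to see how much "noun" is really needed: a sketch, again in Python for brevity, where the shared work is a plain function and the class only remembers the two endpoints. `ListSource`, `sync`, and `Syncer` are hypothetical names, and the row handling is deliberately naive (the destination's rows are simply rebuilt from the source's).

```python
class ListSource:
    """Minimal stand-in for an ISyncable: a header list plus row dicts."""

    def __init__(self, fields, rows=()):
        self.fields = list(fields)
        self.rows = [dict(r) for r in rows]


def sync(src, dst):
    """One-directional sync: make dst carry src's fields and rows."""
    # Add any fields that dst is missing; dst keeps its own column order.
    for f in src.fields:
        if f not in dst.fields:
            dst.fields.append(f)
    # Replace dst's rows with src's rows, laid out in dst's column order.
    dst.rows = [{f: row.get(f, "") for f in dst.fields} for row in src.rows]


class Syncer:
    """Holds the two endpoints; push and pull differ only in direction."""

    def __init__(self, a, b):
        self._a, self._b = a, b

    def push(self):
        sync(self._a, self._b)

    def pull(self):
        sync(self._b, self._a)
```

If the class ends up holding nothing beyond `_A` and `_B`, the function alone may be enough; the class earns its place once it carries configuration such as a helper.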


Part 3/4: CTranslator? ITranslator?

Now, here's where I actually get lost, assuming the above is passable.

Sometimes, two ISyncables speak different "dialects."

For example, believe it or not, Google Spreadsheets (accessed through the Google Data API "list feed") returns column headers lower-cased and stripped of all spaces and symbols! That is, sys_TIMESTAMP is systimestamp, as far as my code can tell. (Yes, I am aware that the "cell feed" does not strip the names this way; however, cell-by-cell manipulation is too slow for what I'm doing.)
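
The normalization described above can be sketched as follows (Python for brevity). The exact rule is an assumption based only on the behaviour described here: lower-case the header, then drop everything that is not a letter or digit. `list_feed_name` and `build_field_map` are hypothetical helpers, not part of any Google API.

```python
import re


def list_feed_name(field):
    """Approximate the header mangling described above (assumed rule)."""
    return re.sub(r"[^a-z0-9]", "", field.lower())


def build_field_map(fields):
    """Map each mangled name back to its original header.

    Raises ValueError if two distinct headers collapse to the same name,
    since the mapping would then be ambiguous.
    """
    mapping = {}
    for f in fields:
        key = list_feed_name(f)
        if key in mapping and mapping[key] != f:
            raise ValueError(f"{f!r} and {mapping[key]!r} both map to {key!r}")
        mapping[key] = f
    return mapping
```

The collision check matters because the mangling is lossy: two distinct headers such as `a_b` and `ab` cannot be told apart once stripped.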

One can imagine other hypothetical examples. Perhaps even the data itself can be in different "dialects." But let's take it as given for now, and not argue this if possible.

Third Question: How would you implement "translation"?

Note: Taking all this as an exercise, I'm more interested in the "idealized" design, rather than the practical one. (God knows that ship sailed when I began this project.)


Part 4/4: Further Thought

Here's my train of thought to demonstrate I've thunk, albeit unfruitfully:

  1. First, I thought, primitively, "I'll just modify CMySQL::GetFields() to lower-case and strip field names so they're compatible with Google Spreadsheets." But of course, then my class should really be called CMySQLForGoogleSpreadsheets, and that can't be right.

  2. So, the thing which translates must exist outside of an ISyncable implementor.

  3. And surely it can't be right to make each translation a method in CSyncer.

  4. If it exists outside of both ISyncable and CSyncer, then what is it? (Is it even an "it"?)

  5. Is it an abstract class, i.e. abstract CTranslator?

  6. Is it an interface, since a translator only does, not has, i.e. interface ITranslator?

  7. Does it even require instantiation? e.g. If it's an ITranslator, then should its translation methods be static? (I learned what "late static binding" meant, today.)
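
Items 2 through 7 point toward a helper that lives outside both ISyncable and CSyncer and is handed to the syncer when it is built. A minimal sketch of that reading, in Python for brevity: the helper is just a function from a field name to a comparison key, so it needs neither instantiation nor inheritance. `Source`, `identity`, `squash`, and `Syncer` are all hypothetical names, and `squash` assumes the lower-case-and-strip rule described in Part 3.

```python
import re


class Source:
    """Minimal stand-in for an ISyncable: a header list plus row dicts."""

    def __init__(self, fields, rows=()):
        self.fields = list(fields)
        self.rows = [dict(r) for r in rows]


def identity(name):
    """Default helper: both sides already speak the same dialect."""
    return name


def squash(name):
    """Assumed list-feed dialect: lower-case, letters and digits only."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


class Syncer:
    """Has two sources and, optionally, a helper (composition)."""

    def __init__(self, a, b, key=identity):
        self._a, self._b, self._key = a, b, key

    def _copy(self, src, dst):
        # Match columns by translated key instead of by raw header.
        src_name = {self._key(f): f for f in src.fields}
        new_rows = []
        for row in src.rows:
            out = {}
            for f in dst.fields:
                match = src_name.get(self._key(f))
                out[f] = row[match] if match in row else ""
            new_rows.append(out)
        dst.rows = new_rows

    def push(self):
        self._copy(self._a, self._b)

    def pull(self):
        self._copy(self._b, self._a)
```

With this shape neither source class knows about the other's dialect, and the syncer itself contains no translation rules; it only applies whichever helper it was given.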

And, dear God, whatever it is, how should a CSyncer use it? Does it "have" it? Is it, "it"?

Who am I? ...am I, "I"?


I've attempted to break up the question into sub-questions, but essentially my question is singular:

How does one implement an object A that conceptually "links" (has) two objects b1 and b2 that share a common interface B, where certain pairs of b1 and b2 require a helper, e.g. a translator, to be handled by A?
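
Since only certain pairs need a helper, one possible arrangement (a sketch under assumptions, in Python for brevity) is a small lookup table keyed by the unordered pair of endpoint types, falling back to a do-nothing helper. The class names, `TRANSLATORS`, and `translator_for` are hypothetical, and which pairs actually need translation is assumed here purely for illustration.

```python
import re


class MySQLSource:
    """Placeholder for a MySQL-backed source."""


class SheetSource:
    """Placeholder for a Google Spreadsheets-backed source."""


class CsvSource:
    """Placeholder for a CSV-backed source."""


def identity(name):
    return name


def squash(name):
    # Assumed dialect rule: lower-case, keep letters and digits only.
    return re.sub(r"[^a-z0-9]", "", name.lower())


# Unordered pair of endpoint types -> helper needed for that pair.
TRANSLATORS = {
    frozenset({MySQLSource, SheetSource}): squash,
    frozenset({CsvSource, SheetSource}): squash,
}


def translator_for(b1, b2):
    """Pick the helper for this pair, or identity if none is registered."""
    return TRANSLATORS.get(frozenset({type(b1), type(b2)}), identity)
```

Object A would then call `translator_for(b1, b2)` once at construction time and keep the result, so the pair-specific knowledge stays in one table rather than in A or in either B.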

Something tells me that I've overcomplicated this design, or violated a principle much higher up.

Thank you all very much for your time and any advice you can provide.
